Reduction of Intermediate Alphabets in Finite-State Transducer Cascades

Author

  • André Kempe
Abstract

This article describes an algorithm for reducing the intermediate alphabets in cascades of finite-state transducers (FSTs). Although the method modifies the component FSTs, the overall relation described by the whole cascade is unchanged. No additional information or special algorithm that could slow down the processing of input is required at runtime. Two examples from natural language processing illustrate the effect of the algorithm on the sizes of the FSTs and their alphabets. With some FSTs the number of arcs and symbols shrank considerably.

Similar resources

Minimum Inferred Finite State

More work should be done on learning homomorphisms into larger alphabets. More importantly, it would be interesting to find natural linguistic limitations of the type of morphological transformations such that the resultant learning problem would become tractable. References [1] Angluin, D. and C. Smith, "Inductive inference: theory and methods". Comput. QUESTION: Is there a K-state determinist...

Variable Automata over Infinite Alphabets

Automated reasoning about systems with infinite domains requires an extension of automata, and in particular, regular automata, to infinite alphabets. Existing formalisms of such automata cope with the infiniteness of the alphabet by adding to the automaton a set of registers or pebbles, or by attributing the alphabet by labels from an auxiliary finite alphabet that is read by an intermediate t...

Efficient Online k-Best Lookup in Weighted Finite-State Cascades

Weighted finite-state transducers (WFSTs) have proved to be powerful and efficient aids for a variety of natural-language processing tasks, including automatic phonetization and phonological rule systems (Kaplan & Kay, 1994; Laporte, 1997), morphological analysis (Geyken & Hanneforth, 2006), and shallow syntactic parsing (Roche, 1997). In particular, cascades arising from the composition of two...

Minimization of Symbolic Transducers

Symbolic transducers extend classical finite state transducers to infinite or large alphabets like Unicode, and are a popular tool in areas requiring reasoning over string transformations where traditional techniques do not scale. Here we develop the theory for and an algorithm for computing quotients of such transducers under indistinguishability preserving equivalence relations over states su...

KNG: a Tool for Writing Easily Transducer Cascades (KNG: un outil pour l'écriture facile de cascades de transducteurs) [in French]

Résumé. This article presents a Python library called KNG for easily writing finite automata and transducers. Thanks to careful handling of encodings and input/output, the library makes it possible to build a transducer cascade from Python scripts connected by Unix pipes. Abstract. This paper presents a Python library called KNG which provides facili...

Journal title:
  • CoRR

Volume cs.CL/0010030  Issue

Pages  -

Publication date 2000